Goto

Collaborating Authors

 exponential convergence


Wasserstein Contraction of Coordinate Ascent Variational Inference

arXiv.org Machine Learning

Finding approximations to an intractable probability distribution π of interest (usually known only up to a normalizing constant) is a key problem in scientific computing. Variational Inference stands out as a particularly attractive tool for this task, owing to its statistical and computational efficiency, and it has been the framework underlying many advances in computational statistics over the past half century (Parisi, 1980; Hinton and Van Camp, 1993; Jordan et al., 1999; Bishop and Nasrabadi, 2006). The central idea is to seek a tractable approximation to π within a chosen family of tractable distributions Q by minimizing a divergence to π over that'variational' family. Often, it is convenient or well-motivated to work with the family of product (or tensor, or factorized) distributions Q = P m, and define optimality through minimisation of the Kullback-Leibler (KL) divergence (also'relative entropy') min KL(ϱ||π): ϱ P m . A key practical aspect of working with this particular loss function is that in solving the associated optimisation problem, one is only required to compute expectations under the tractable variational distribution ϱ, rather than under the intractable target distribution π. In Bayesian statistics, π typically represents the joint posterior distribution of latent variables z Z and some parameters β B given observed data y Y. In these cases, we often choose m = 2 and seek the best variational approximation µ(dz) ν(dβ) to π to solve min KL(µ ν||π): µ P(Z), ν P(B) . The coordinate ascent variational inference algorithm (CAVI, Bishop and Nasrabadi, 2006; Blei et al., 2017) solves this problem by iteratively minimizing the Kullback-Leibler divergence with respect to one element at a time: given a starting point ν0, it iterates µk:= argmin



Robust Canonicalization through Bootstrapped Data Re-Alignment

arXiv.org Artificial Intelligence

Fine-grained visual classification (FGVC) tasks, such as insect and bird identification, demand sensitivity to subtle visual cues while remaining robust to spatial transformations. A key challenge is handling geometric biases and noise, such as different orientations and scales of objects. Existing remedies rely on heavy data augmentation, which demands powerful models, or on equivariant architectures, which constrain expressivity and add cost. Canonicalization offers an alternative by shielding such biases from the downstream model. In practice, such functions are often obtained using canonicalization priors, which assume aligned training data. Unfortunately, real-world datasets never fulfill this assumption, causing the obtained canonicalizer to be brittle. We propose a bootstrapping algorithm that iteratively re-aligns training samples by progressively reducing variance and recovering the alignment assumption. We establish convergence guarantees under mild conditions for arbitrary compact groups, and show on four FGVC benchmarks that our method consistently outperforms equivariant, and canonicalization baselines while performing on par with augmentation.



A Dimension-Decomposed Learning Framework for Online Disturbance Identification in Quadrotor SE(3) Control

arXiv.org Artificial Intelligence

Quadrotor stability under complex dynamic disturbances and model uncertainties poses significant challenges. One of them remains the underfitting problem in high-dimensional features, which limits the identification capability of current learning-based methods. To address this, we introduce a new perspective: Dimension-Decomposed Learning (DiD-L), from which we develop the Sliced Adaptive-Neuro Mapping (SANM) approach for geometric control. Specifically, the high-dimensional mapping for identification is axially ``sliced" into multiple low-dimensional submappings (``slices"). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional tasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without any pre-training or persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the full-state closed-loop system exhibits arbitrarily close to exponential stability despite multi-dimensional time-varying disturbances and model uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unknown disturbances and specific knowledge of the model.


Dimension-Decomposed Learning for Quadrotor Geometric Attitude Control with Almost Global Exponential Convergence on SO(3)

arXiv.org Artificial Intelligence

This paper introduces a lightweight and interpretable online learning approach called Dimension-Decomposed Learning (DiD-L) for disturbance identification in quadrotor geometric attitude control. As a module instance of DiD-L, we propose the Sliced Adaptive-Neuro Mapping (SANM). Specifically, to address underlying underfitting problems, the high-dimensional mapping for online identification is axially ``sliced" into multiple low-dimensional submappings (slices). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional subtasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without the persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the state solution of the rotational error dynamics exponentially converges into an arbitrarily small ball within an almost global attraction domain, despite time-varying disturbances and inertia uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unseen disturbances and specific knowledge of the model. To our knowledge in the quadrotor control field, DiD-L is the first online learning approach that is lightweight enough to run in real-time at 400 Hz on microcontroller units (MCUs) such as STM32, and has been validated through real-world experiments.



From Text to Trajectories: GPT-2 as an ODE Solver via In-Context

arXiv.org Artificial Intelligence

In-Context Learning (ICL) has emerged as a new paradigm in large language models (LLMs), enabling them to perform novel tasks by conditioning on a few examples embedded in the prompt. Yet, the highly nonlinear behavior of ICL for NLP tasks remains poorly understood. To shed light on its underlying mechanisms, this paper investigates whether LLMs can solve ordinary differential equations (ODEs) under the ICL setting. We formulate standard ODE problems and their solutions as sequential prompts and evaluate GPT-2 models on these tasks. Experiments on two types of ODEs show that GPT-2 can effectively learn a meta-ODE algorithm, with convergence behavior comparable to, or better than, the Euler method, and achieve exponential accuracy gains with increasing numbers of demonstrations. Moreover, the model generalizes to out-of-distribution (OOD) problems, demonstrating robust extrapolation capabilities. These empirical findings provide new insights into the mechanisms of ICL in NLP and its potential for solving nonlinear numerical problems.


Neural Policy Iteration for Stochastic Optimal Control: A Physics-Informed Approach

arXiv.org Artificial Intelligence

We propose a physics-informed neural network policy iteration (PINN-PI) framework for solving stochastic optimal control problems governed by second-order Hamilton--Jacobi--Bellman (HJB) equations. At each iteration, a neural network is trained to approximate the value function by minimizing the residual of a linear PDE induced by a fixed policy. This linear structure enables systematic $L^2$ error control at each policy evaluation step, and allows us to derive explicit Lipschitz-type bounds that quantify how value gradient errors propagate to the policy updates. This interpretability provides a theoretical basis for evaluating policy quality during training. Our method extends recent deterministic PINN-based approaches to stochastic settings, inheriting the global exponential convergence guarantees of classical policy iteration under mild conditions. We demonstrate the effectiveness of our method on several benchmark problems, including stochastic cartpole, pendulum problems and high-dimensional linear quadratic regulation (LQR) problems in up to 10D.


Some remarks on gradient dominance and LQR policy optimization

arXiv.org Artificial Intelligence

Solutions of optimization problems, including policy optimization in reinforcement learning, typically rely upon some variant of gradient descent. There has been much recent work in the machine learning, control, and optimization communities applying the Polyak-Łojasiewicz Inequality (PLI) to such problems in order to establish an exponential rate of convergence (a.k.a. ``linear convergence'' in the local-iteration language of numerical analysis) of loss functions to their minima under the gradient flow. Often, as is the case of policy iteration for the continuous-time LQR problem, this rate vanishes for large initial conditions, resulting in a mixed globally linear / locally exponential behavior. This is in sharp contrast with the discrete-time LQR problem, where there is global exponential convergence. That gap between CT and DT behaviors motivates the search for various generalized PLI-like conditions, and this talk will address that topic. Moreover, these generalizations are key to understanding the transient and asymptotic effects of errors in the estimation of the gradient, errors which might arise from adversarial attacks, wrong evaluation by an oracle, early stopping of a simulation, inaccurate and very approximate digital twins, stochastic computations (algorithm ``reproducibility''), or learning by sampling from limited data. We describe an ``input to state stability'' (ISS) analysis of this issue. The second part discusses convergence and PLI-like properties of ``linear feedforward neural networks'' in feedback control. Much of the work described here was done in collaboration with Arthur Castello B. de Oliveira, Leilei Cui, Zhong-Ping Jiang, and Milad Siami.